ABSTRACT



A method and apparatus for performing storage and retrieval 
in an information storage system is disclosed that uses the 
hashing technique with the external chaining method for 
collision resolution. In order to prevent performance dete- 
rioration due to the presence of automatically expiring data 
items, a garbage collection technique is used that removes 
all expired records stored in the system in the external chain 
targeted by a probe into the data storage system. More 
particularly, each insertion, retrieval, or deletion of a record 
is an occasion to search an entire linked-list chain of records 
for expired items and then remove them. Because an expired
data item will not remain in the system long term if the
system is frequently probed, the technique is useful for large
information storage systems that are heavily used, require the
fast access provided by hashing, and cannot be taken off-line
for removal of expired data.

8 Claims, 6 Drawing Sheets 
METHODS AND APPARATUS FOR
INFORMATION STORAGE AND RETRIEVAL 
USING A HASHING TECHNIQUE WITH 
EXTERNAL CHAINING AND ON-THE-FLY 
REMOVAL OF EXPIRED DATA 

CROSS-REFERENCE TO RELATED 
APPLICATIONS 

Not Applicable 

STATEMENT REGARDING FEDERALLY 
SPONSORED RESEARCH OR DEVELOPMENT 

Not Applicable 

REFERENCE TO A MICROFICHE APPENDIX 
Not Applicable 

BACKGROUND OF THE INVENTION 

This invention relates to information storage and retrieval 
systems, and, more particularly, to the use of hashing 
techniques in such systems. 

Information or data stored in a computer-controlled stor- 
age mechanism can be retrieved by searching for a particular 
key value in the stored records. The stored record with a key 
matching the search key value is then retrieved. Such 
searching techniques require repeated access to records in
the storage mechanism to perform key comparisons. In large
storage and retrieval systems, such searching, even if aug- 
mented by efficient search procedures such as the binary 
search, often requires an excessive amount of time due to the 
large number of key comparisons required. 

Another well-known and much faster way of storing and 
retrieving information from computer storage, albeit at the 
expense of additional storage, is the so-called "hashing" 
technique, also called scatter-storage or key-transformation 
method. In such a system, the key is operated on by a 
hashing function to produce a storage address in the storage 
space, called the hash table, which is a large one- 
dimensional array of record locations. This storage address 
is then accessed directly for the desired record. Hashing 
techniques are described in the classic text by D. E. Knuth 
entitled The Art of Computer Programming, Volume 3, 
Sorting and Searching, Addison-Wesley, Reading, Mass., 
1973, pp. 506-549. 

Hashing functions are designed to translate the universe 
of keys into addresses uniformly distributed throughout the 
hash table. Typical hashing functions include truncation, 
folding, transposition, and modulo arithmetic. A disadvan- 
tage of hashing is that more than one key will inevitably 
translate to the same storage address, causing "collisions" in
storage. Some form of collision resolution must therefore be 
provided. For example, the simple strategy called "linear 
probing," which consists of searching forward from the 
initial storage address to the first empty storage location, is 
often used. 
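
As a rough illustration of the two ideas above, the following Python
sketch hashes integer keys with modulo arithmetic and resolves
collisions by linear probing. The table size, the function names, and
the use of None to mark an empty location are assumptions made for
this example only; they are not taken from the cited text.

```python
TABLE_SIZE = 7  # assumed small size so that collisions are easy to see


def hash_address(key: int) -> int:
    """Modulo-arithmetic hashing function: maps a key to a table address."""
    return key % TABLE_SIZE


def probe_insert(table: list, key: int) -> int:
    """Store key at its hashed address, or at the first empty location after it."""
    address = hash_address(key)
    for step in range(TABLE_SIZE):
        slot = (address + step) % TABLE_SIZE  # search forward, wrapping around
        if table[slot] is None or table[slot] == key:
            table[slot] = key
            return slot
    raise MemoryError("hash table is full")


table = [None] * TABLE_SIZE
first = probe_insert(table, 10)   # 10 % 7 == 3
second = probe_insert(table, 17)  # 17 % 7 == 3 as well, so the keys collide
```

Both keys hash to the same address, so the second key is displaced to
the next empty location.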

Another method for resolving collisions is called "exter- 
nal chaining." In this technique, each hash table location is 
a pointer to the head of a linked list of records, all of whose 
keys translate under the hashing function to that very hash 
table address. The linked list is itself searched sequentially 
when retrieving, inserting, or deleting a record. Insertion and 
deletion are done by adjusting pointers in the linked list. 
External chaining is discussed in considerable detail in the 
aforementioned text by D. E. Knuth, in Data Structures and 
Program Design, Second Edition, by R. L. Kruse, Prentice-Hall,
Incorporated, Englewood Cliffs, N.J., 1987, Section
6.5, "Hashing," and Section 6.6, "Analysis of Hashing," pp.
198-215, and in Data Structures with Abstract Data Types
and Pascal, by D. F. Stubbs and N. W. Webre, Brooks/Cole
Publishing Company, Monterey, Calif., 1985, Section 7.4,
"Hashed Implementations," pp. 310-336.
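
External chaining as described above can be sketched in Python as
follows. This is a minimal illustration, not the implementation given
in the cited texts: the names Node, chain_insert, and chain_lookup,
the modulo hashing function, and the table size are all assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    value: str
    next: Optional["Node"] = None  # pointer to the next record in the chain


TABLE_SIZE = 5  # assumed size
table: list = [None] * TABLE_SIZE  # each entry points to the head of a chain


def chain_insert(key: int, value: str) -> None:
    index = key % TABLE_SIZE
    node = table[index]
    while node is not None:        # sequential search of the chain
        if node.key == key:
            node.value = value     # key already present: update in place
            return
        node = node.next
    # Key not found: link a new record in at the head of the chain.
    table[index] = Node(key, value, table[index])


def chain_lookup(key: int) -> Optional[str]:
    node = table[key % TABLE_SIZE]
    while node is not None:
        if node.key == key:
            return node.value
        node = node.next
    return None


chain_insert(3, "a")
chain_insert(8, "b")  # 8 % 5 == 3, so this record joins the chain of key 3
```

Insertion only adjusts pointers; no record already in the chain is
moved.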

Some forms of information are such that individual data 
items, after a limited period of time, become obsolete, and 
their presence in the storage system is no longer needed or
desired. Scheduling activities, for example, involve data that
become obsolete once the scheduled event has occurred. An
automatically-expiring data item, once it expires, needlessly
occupies computer memory storage that could otherwise be
put to use storing an unexpired item. Thus, expired items
must eventually be removed to reclaim the storage for
subsequent data insertions. In addition, the presence of many
expired items results in needlessly long search times since
the linked lists associated with external chaining will be
longer than they otherwise would be. The goal is to remove
these expired items to reclaim the storage and maintain fast
access to the data.

The problem, then, is to provide the speed of access of 
hashing techniques for large, heavily used information stor- 
age systems having expiring data and, at the same time, 
prevent the performance degradation resulting from the 
accumulation of many expired records. Although a hashing 
technique for dealing with expiring data is known and 
disclosed in U.S. Pat. No. 5,121,495, issued Jun. 9, 1992, 
that technique is confined to linear probing and is entirely
inapplicable to external chaining. The procedure shown
there traverses, in reverse order, a consecutive sequence of
records residing in the hash table array, continually relocating
unexpired records to fill gaps left by the removal of
expired ones.

Unlike arrays, linked lists leave no gaps when items are
removed from them, and furthermore it is not possible to
efficiently traverse a singly linked list in reverse order. There are
significant advantages to external chaining over linear probing
that sometimes make it the method of choice, as discussed
in considerable detail in the aforementioned texts,
and so hashing techniques for dealing with expiring data that
do not use external chaining prove wholly inadequate for
certain applications. For example, if the data records are
large, considerable memory can be saved using external
chaining instead of linear probing. Accordingly, there is a
need to develop hashing techniques for external chaining
with expiring data. The methods of the above-mentioned
patent are limited to arrays and cannot be used with linked
lists due to the significant difference in the organization of
the computer's memory.

BRIEF SUMMARY OF THE INVENTION 

In accordance with the illustrative embodiment of the
invention, these and other problems are overcome by using
a garbage collection procedure "on-the-fly" while other
types of access to the storage space are taking place. In
particular, during normal data insertion or retrieval probes
into the data store, the expired, obsolete records are identified
and removed from the external chain linked list.
Specifically, expired or obsolete records in the linked list
including the record to be accessed are removed as part of
the normal search procedure.
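
The removal of expired records during a normal search can be
sketched in Python as shown here. This is a hedged illustration
rather than the pseudocode of the APPENDIX: the Node class, the
function name search_and_purge, and the use of a numeric timestamp
as the expiry test are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: str
    expires_at: float              # assumed expiry criterion: a timestamp
    next: Optional["Node"] = None


def search_and_purge(head, key, now):
    """Walk one chain, unlink expired nodes, return (new_head, match_or_None)."""
    match = None
    previous = None
    node = head
    while node is not None:
        if node.expires_at <= now:         # expired: bypass this node
            if previous is None:
                head = node.next           # removing the first element
            else:
                previous.next = node.next
        else:
            if node.key == key:
                match = node
            previous = node                # only live nodes become predecessors
        node = node.next
    return head, match


chain = Node("a", 5.0, Node("b", 50.0, Node("c", 1.0)))
chain, found = search_and_purge(chain, "b", now=10.0)
```

A single pass both locates record "b" and unlinks the expired records
"a" and "c", so no separate off-line sweep is needed for this chain.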
This incremental garbage collection technique has the
decided advantage of automatically eliminating unneeded
records without requiring that the information storage system
be taken off-line for such garbage collection. This is
particularly important for information storage systems
requiring rapid access and continuous availability to the user
population.

More specifically, a method for storing and retrieving
information records using a linked list to store and provide
access to the records, at least some of the records automatically
expiring, is disclosed. The method accesses the linked
list of records and identifies at least some automatically
expired ones of the records. It also removes at least some
automatically expired ones of the records from the linked list
when the linked list is accessed. Furthermore, the method
provides for dynamically determining the maximum number of
expired ones of the records to be removed when the linked
list is accessed.

BRIEF DESCRIPTION OF THE SEVERAL
VIEWS OF THE DRAWING

A complete understanding of the present invention may be
gained by considering the following detailed description in
conjunction with the accompanying drawing, in which:

FIG. 1 shows a general block diagram of a computer
system hardware arrangement in which the information
storage and retrieval system of the present invention might
be implemented;

FIG. 2 shows a general block diagram of a computer
system software arrangement in which the information storage
and retrieval system of the present invention might find
use;

FIG. 3 shows a general flow chart for a table searching
operation that might be used in a hashed storage system in
accordance with the present invention;

FIG. 4 shows a general flow chart for a linked-list element
remove procedure that forms part of the table searching
operation of FIG. 3;

FIG. 5 shows a general flow chart for a record insertion
operation that might be used in a hashed storage system in
accordance with the present invention;

FIG. 6 shows a general flow chart for a record retrieval
operation for use in a hashed storage system in accordance
with the present invention; and

FIG. 7 shows a general flow chart for a record deletion
operation that might be used in a hashed storage system in
accordance with the present invention.

To facilitate reader understanding, identical reference
numerals are used to designate elements common to the
figures.

DETAILED DESCRIPTION OF THE
INVENTION

FIG. 1 of the drawings shows a general block diagram of
a computer hardware system comprising a Central Processing
Unit (CPU) 10 and a Random Access Memory (RAM)
unit 11. Computer programs stored in the RAM 11 are
accessed by CPU 10 and executed, one instruction at a time,
by CPU 10. Data, stored in other portions of RAM 11, are
operated on by the program instructions accessed by CPU 10
from RAM 11, all in accordance with well-known data
processing techniques.

Central Processing Unit (CPU) 10 also controls and
accesses a disk controller unit 12 that, in turn, accesses
digital data stored on one or more disk storage units such as
disk storage unit 13 until required by CPU 10. At this time,
such programs and data are retrieved from disk storage unit
13 in blocks and stored in RAM 11 for rapid access.

Central Processing Unit (CPU) 10 also controls an
Input/Output (I/O) controller 14 that, in turn, provides access to a
plurality of input devices such as CRT (cathode ray tube)
terminal 15, as well as a plurality of output devices such as
printer 16. Terminal 15 provides a mechanism for a computer
user to introduce instructions and commands into the
computer system of FIG. 1, and may be supplemented with
other input devices such as magnetic tape readers, remotely
located terminals, optical readers, and other types of input
devices. Similarly, printer 16 provides a mechanism for
displaying the results of the operation of the computer
system of FIG. 1 for the computer user. Printer 16 may
similarly be supplemented by line printers, cathode ray tube
displays, phototypesetters, laser printers, graphical plotters,
and other types of output devices.

The constituents of the computer system of FIG. 1 and
their cooperative operation are well-known in the art and are
typical of all computer systems, from small personal computers
to large mainframe systems. The architecture and
operation of such systems are well-known and will not be
further described here.

FIG. 2 shows a graphical representation of a typical
software architecture for a computer system such as that
shown in FIG. 1. The software of FIG. 2 comprises a user
access mechanism that, for simple personal computers, may
consist of nothing more than turning the system on. In larger
systems, providing service to many users, login and password
procedures would typically be implemented in user
access mechanism 20. Once user access mechanism 20 has
completed the login procedure, the user is placed in the
operating system environment 21. Operating system 21
coordinates the activities of all of the hardware components
of the computer system (shown in FIG. 1) and provides a
number of utility programs 22 of general use to the computer
user. Utilities 22 might, for example, comprise basic file
access and manipulation programs, system maintenance
facilities, and programming language compilers.

The computer software system of FIG. 2 typically also
includes application programs such as application software
23, 24, . . . , 25. Application software 23 through 25 might,
for example, comprise a text editor, document formatting
software, a spreadsheet program, a database management
system, a game program, and so forth.

The present invention is concerned with information
storage and retrieval. It can be application software packages
23-25, or used by other parts of the system, such as user
access software 20 or operating system 21 software. The
information storage and retrieval technique provided by the
present invention is herein disclosed as flowcharts in FIGS.
3 through 7, and shown as PASCAL-like pseudocode in the
APPENDIX to this specification.

Before proceeding to a description of one embodiment of
the present invention, it is first useful to discuss hashing
techniques in general. Many fast techniques for storing and
retrieving data are known in the prior art. In situations where
storage space is considered cheap compared with retrieval
time, a technique called hashing is often used. In classic
hashing, each record in the information storage system
includes a distinguished field unique in value to each record,
called the key, which is used as the basis for storing and
retrieving the associated record. Taken as a whole, a hash
table is a large, one-dimensional array of logically
contiguous, consecutively numbered, fixed-size storage
units. Such a table of records is typically stored in RAM 11
of FIG. 1, where each record is an identifiable and addressable
location in physical memory. A hashing function
translates the key into a hash table array subscript, which is
used as an index into the array where searches for the data
record begin. The hashing function can be any operation on
the key that results in subscripts mostly uniformly distributed
across the table. Known hashing functions include
truncation, folding, transposition, modulo arithmetic, and
combinations of these operations. Unfortunately, hashing
functions generally do not produce unique locations in the
hash table, in that many distinct keys map to the same
location, producing what are called collisions. Some form of
collision resolution is required in all hashing systems. In
every occurrence of collision, finding an alternate location
for a collided record is necessary. Moreover, the alternate
location must be readily reachable during future searches for
the displaced record.

A common collision resolution strategy, with which the
present invention is concerned, is called external chaining.
Under external chaining, each hash table entry stores all of
the records that collided at that location by storing not the
records themselves, but instead a pointer to the head of a
linked list of those same records. Such linked lists are
formed by storing the records individually in dynamically
allocated storage and maintaining with each record a pointer
to the location of the next record in the chain of collided
records. When a search key is hashed to a hash table entry,
the pointer found there is used to locate the first record. If the
search key does not match the key found there, the pointer
there is used to locate the second record. In this way, the
"chain" of records is traversed sequentially until the desired
record is found or until the end of the chain is reached.
Deletion of records involves merely adjusting the pointers to
bypass the deleted record and returning the storage it occupied
to the available storage pool maintained by the system.

Hashing techniques have been used classically for very
fast access to static, short term data such as a compiler
symbol table. Typically, in such storage systems, deletions
are infrequent and the need for the storage system disappears
quickly. In some common types of data storage systems,
however, the storage system is long lived and records can
become obsolete merely by the passage of time or by the
occurrence of some event. If such expired, lapsed, or obsolete
records are not removed from the system, they will, in
time, seriously degrade the performance of the retrieval
system. Degradation shows up in two ways. First, the
presence of expired records lengthens search times since
they cause the external chains to be longer than they
otherwise would be. Second, expired records occupy
dynamically allocated memory storage that could be
returned to the system memory pool for useful allocation.
Thus, when the system memory pool is depleted, a new data
item can be inserted into the storage system only if the
memory occupied by an expired one is reclaimed.

Referring then to FIG. 3, there is shown a flowchart of a
search table procedure for searching the hash table preparatory
to inserting, retrieving, or deleting a record, in accordance
with the present invention, and involving the dynamic
removal of expired records in a targeted linked list. Starting
in box 30 of the search table procedure of FIG. 3, the search
key of the record being searched for is hashed in box 31 to
provide the subscript of an array element. In box 32, the hash
table array location indicated by the subscript generated in
box 31 is accessed to provide the pointer to the target linked
list. Decision box 33 examines the pointer value to determine
whether the end of the linked list has been reached. If
the end has been reached, decision box 34 is entered to
determine if a key match was previously found in decision
box 39 (as will be described below). If so, the search is
successful and returns success in box 35, followed by the
procedure's termination in terminal box 37. If not, box 36 is
entered where failure is returned and the procedure again
terminates in box 37.

If the end of the list has not been reached as determined
by decision box 33, decision box 38 is entered to determine
if the record pointed to has expired. This is determined by
comparing some portion of the contents of the record to
some external condition. A timestamp in the record, for
example, could be compared with the current time-of-day
value maintained in all computers. Alternatively, the occurrence
of an event could be compared with a field identifying
that event in the record. In any case, if the record has not
expired, decision box 39 is entered to determine if the key
in this record matches the search key. If it does, the address
of the record is saved in box 40 and box 41 is entered. If the
record does not match the search key, the procedure
bypasses box 40 and proceeds directly to box 41. In box
41, the procedure advances forward to the next record in the
linked list and the procedure returns to box 33.

If decision box 38 determines that the record under
question has expired, box 42 is entered to perform the
on-the-fly removal of the expired record from the linked list
and the return of the storage it occupies to the system storage
pool, as will be described in connection with FIG. 4. In
general, the remove procedure of box 42 (FIG. 4) operates
to remove an element from the linked list by adjusting its
predecessor's pointer to bypass that element. (However, if
the element to be removed is the first element of the list, then
there is no predecessor and the hash table array entry is
adjusted instead.) On completion of procedure remove
invoked from box 42, the search table procedure returns to
box 33.

It can be seen that the search table procedure of FIG. 3
operates to examine the entire linked list of records of which
the searched-for record is a part, and to remove expired
records, returning storage to the storage pool with each
removal. If the storage pool is depleted and many expired
records remain despite such automatic garbage collection,
then the insertion of new records is inhibited (boxes 76 and
77 of FIG. 5) until a deletion is made by the delete procedure
(FIG. 7) or until the search table procedure has had a chance
to replenish the storage pool through its on-the-fly garbage
collection.

Though the search table procedure as shown in FIG. 3,
implemented in the APPENDIX as PASCAL-like
pseudocode, and described above appears in connection
with an information storage and retrieval system using the
hashing technique with external chaining, its on-the-fly
removal technique while traversing a linked list can be used
anywhere a linked list of records with expiring data appears,
even in contexts unrelated to hashing. A person skilled in the
art will appreciate that this technique can be readily applied
to manipulate linked lists not necessarily used with hashing.

The search table procedure shown in FIG. 3, implemented
as pseudocode in the APPENDIX, and described above
traverses the entire linked list removing all expired records
as it searches for a key match. The procedure can be readily
adapted to remove some but not all of the expired records,
thereby shortening the linked list traversal time and speeding
up the search at the expense of perhaps leaving some expired
records in the list. For example, the procedure can be
modified to terminate when a key match occurs. (PASCAL-like
pseudocode for this alternate version of search table
appears in the APPENDIX.) The implementor even has the
prerogative of choosing among these strategies dynamically
at the time search table is invoked by the caller, thus
sometimes removing all expired records, at other times
removing some but not all of them, and yet at other times
choosing to remove none of them. Such a dynamic runtime
decision might be based on factors such as, for example,
how much memory is available in the system storage pool,
general system load, time of day, the number of records
currently residing in the information system, and other
factors both internal and external to the information storage
and retrieval system itself. A person skilled in the art will
appreciate that the technique of removing all expired records
while searching the linked list can be expanded to include
techniques whereby not necessarily all expired records are
removed, and that the decision regarding if and how many
records to delete can be a dynamic one.

In FIG. 4 there is shown a flowchart of a remove procedure
that removes a record from the retrieval system, either
an unexpired record through the delete procedure as will be
noted in connection with FIG. 7, or an expired record
through the search table procedure as noted in connection
with FIG. 3. In general, this is accomplished by the invoking
procedure, being either the delete procedure (FIG. 7) or the
search table procedure (FIG. 3), passing to the remove
procedure a pointer to a linked list element to remove, a
pointer to that element's predecessor element in the same
linked list, and the subscript of the hash table array location
containing the pointer to the head of the linked list from
which the element is to be removed. In the case that the
element to be removed is the first element of the linked list,
the predecessor pointer passed to the remove procedure by
the invoking procedure has the NIL (sometimes called
NULL, or EMPTY) value, indicating to the remove procedure
that the element to be removed has no predecessor in
the list. The invoking procedure expects the remove
procedure, on completion, to have advanced the passed
pointer that originally pointed to the now-removed element
so that it points to the successor element in that linked list,
or to NIL if the removed element was the final element. (The
search table procedure of FIG. 3, in particular, makes use of
the remove procedure's advancing this passed pointer in the
described way; it is made use of in that box 33 of FIG. 3 is
entered directly following completion of box 42, as was
described above in connection with FIG. 3.)

The remove procedure causes actual removal of the
designated element by adjusting the predecessor pointer so
that it bypasses the element to be removed. In the case that
the predecessor pointer has the NIL value, the hash table
array entry indicated by the passed subscript plays the role
of the predecessor pointer and is adjusted the same way in
its stead. Following pointer adjustments, the storage occupied
by the removed element is returned to the system
storage pool for future allocation.

Beginning, then, at starting box 50 of FIG. 4, the pointer
to the list element to remove is advanced in box 51 so that
it points to its successor in the linked list. Next, decision box
52 determines if the element to remove is the first element
in the containing linked list by testing the predecessor
pointer for the NIL value, as described above. If so, box 54
is entered to adjust the linked list head pointer in the hash
table array to bypass the first element, after which the
procedure continues on to box 55. If not, box 53 is entered
where the predecessor pointer is adjusted to bypass the
element to remove, after which the procedure proceeds, once
again, to box 55. Finally, in box 55 the storage occupied by
the bypassed element is returned to the system storage pool
and the procedure terminates in terminal box 56.

FIG. 5 shows a detailed flowchart of an insert procedure
suitable for use in the information storage and retrieval
system of the present invention. The insert procedure of FIG.
5 begins at starting box 70 from which box 71 is entered. In
box 71, the search table procedure of FIG. 3 is invoked with
the search key of the record to be inserted. As noted in
connection with FIG. 3, the search table procedure finds the
linked list element whose key value of the record contained
therein matches the search key and, at the same time,
removes expired records on-the-fly from that linked list.
Decision box 72 is then entered where it is determined
whether the search table procedure found a record with
matching key value. If so, box 73 is entered where the record
to be inserted is put into the linked list element in the
position of the old record with the matching key. In box 74,
the procedure reports that the old record has been
replaced by the new record and the procedure terminates in
terminal box 75.

Returning to decision box 72, if a matching record is not
found, decision box 76 is entered to determine if there is
sufficient storage in the system storage pool to accommodate
a new linked list element. If not, box 77 is entered to report
that the storage system is full and the record cannot be
inserted. Following that, the procedure terminates in terminal
box 75.

If decision box 76 determines that sufficient storage can
be allocated from the system storage pool for a new linked
list element, then box 78 is entered where the actual memory
allocation is made. In box 79, the record to be inserted is
copied into the storage allocated in box 78, and box 80 is
entered. In box 80, the linked list element containing the
record copied into it in box 79 is inserted into the linked list
to which the contained record hashed. The procedure then
reports that the record was inserted into the information
storage and retrieval system in box 81 and the procedure
terminates in box 75.

FIG. 6 shows a detailed flowchart of a retrieve procedure
used to retrieve a record from the information storage and
retrieval system. Starting in box 90, the search table procedure
of FIG. 3 is invoked in box 91, using the key of the
record to be retrieved as the search key. In decision box 92,
it is determined if a record with a matching key was found
by the search table procedure. If not, box 93 is entered to
report failure of the retrieve procedure and the procedure
terminates in terminal box 96. If a matching record was
found, box 94 is entered to copy the matching record into a
record store for processing by the calling program, box 95
is entered to return an indication of successful retrieval, and
the procedure terminates in terminal box 96.

FIG. 7 shows a detailed flowchart of a delete procedure
useful for actively removing records from the information
storage and retrieval system. Starting at box 100, the procedure
of FIG. 7 invokes the search table procedure of FIG.
3 in box 101, using the key of the record to be deleted as the
search key. In decision box 102, it is determined if the search
table procedure was able to find a record with matching key.
If not, box 103 is entered to report failure of the deletion
procedure, and the procedure terminates in terminal box
106. If a matching record was found, as determined by
decision box 102, the remove procedure of FIG. 4 is invoked
in box 104. As noted in connection with FIG. 4, the remove
procedure causes removal of a designated linked list element
from its containing linked list. Box 105 is then entered to
report successful deletion to the calling program, and the
procedure terminates in terminal box 106.

The attached APPENDIX contains PASCAL-like
pseudocode listings for all of the programmed components
necessary to implement an information storage and retrieval
system operating in accordance with the present invention.
Any person of ordinary skill in the art will have no difficulty
implementing the disclosed system and procedures shown in
the APPENDIX, including programs for all common hardware
and system software arrangements, on the basis of this
description, including flowcharts and information shown in
the APPENDIX. 

It should also be clear to those skilled in the art that other 
embodiments of the present invention may be made by those 
skilled in the art without departing from the teachings of the 
present invention. It is also clear to those skilled in the art
that the invention can be used in diverse computer 
applications, and that it is not limited to the use of hash 
tables, but is applicable to other techniques requiring linked 
lists with expiring records. 
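
As an illustration of use outside hashing, the following Python
sketch purges a plain singly linked list of expiring items while
limiting how many are removed per access, with the limit chosen at
run time. It is a sketch under stated assumptions only: the Item
class, the purge and names functions, the timestamp expiry test, and
in particular the choose_limit policy and its thresholds are all
hypothetical and do not appear in the APPENDIX.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    expires_at: float
    next: Optional["Item"] = None


def purge(head, now, max_removals):
    """Unlink up to max_removals expired items; return (new_head, removed_count)."""
    removed = 0
    previous = None
    node = head
    while node is not None and removed < max_removals:
        if node.expires_at <= now:
            if previous is None:
                head = node.next           # expired item was first in the list
            else:
                previous.next = node.next  # bypass the expired item
            removed += 1
        else:
            previous = node
        node = node.next
    return head, removed


def choose_limit(free_slots: int) -> int:
    """Hypothetical runtime policy: purge harder when free memory is scarce."""
    return 1000 if free_slots < 10 else 1


def names(head):
    out = []
    while head is not None:
        out.append(head.name)
        head = head.next
    return out


lst = Item("x", 1.0, Item("y", 2.0, Item("z", 99.0)))
lst, count = purge(lst, now=10.0, max_removals=choose_limit(free_slots=500))
```

With plenty of free storage the policy allows only one removal, so
the expired item "y" is left in place until a later access.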



Appendix 
Functions Provided 

The following functions are made available to the application program: 

1. insert (record: record_type)

Returns replaced if a record associated with record.key was found and
subsequently replaced.

Returns inserted if a record associated with record.key was not found and the
passed record was subsequently inserted.

Returns full if a record associated with record.key was not found and the passed
record could not be inserted because no memory is available.

2. retrieve (record: record_type)

Returns success if record associated with record.key was found and assigned to
record.

Returns failure if search was unsuccessful.

3. delete (record_key: record_key_type)

Returns success if record associated with record_key was found and
subsequently deleted.
Returns failure if not found. 

Definitions 

The following formal definitions are required for specifying the insertion, retrieval, and deletion 
procedures. They are global to all procedures and functions shown below. 

1. const table_size /* Size of hash table. */

2. type list_element_pointer = ↑list_element /* Pointer to elements of linked list. */

3. type list_element = /* Each element of linked list. */
     record
       record_contents: record_type;
       next: list_element_pointer /* Singly-linked list. */
     end

4. var table: array [0 . . . table_size - 1] of list_element_pointer /* Hash table. */
                                   /* Each array entry is pointer to head of list. */

Initial state of table: table[i] = nil ∀ i, 0 ≤ i < table_size /* Initially empty. */
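
For readers more familiar with a modern language, the definitions above might be mirrored in Python roughly as follows. This is only an illustrative sketch and not part of the disclosed listings: the names `Record`, `ListElement`, and `TABLE_SIZE`, and in particular the `expires_at` field, are assumptions, since the appendix does not state how a record marks its own expiration.

```python
from dataclasses import dataclass
from typing import Any, List, Optional

TABLE_SIZE = 8  # assumed value; the appendix leaves table_size unspecified


@dataclass
class Record:
    key: Any
    value: Any
    expires_at: float  # assumed expiry timestamp; not defined in the appendix


@dataclass
class ListElement:
    record_contents: Record
    next: Optional["ListElement"] = None  # singly-linked list


# Each array entry points to the head of a chain; None plays the role of nil,
# so the table starts out empty.
table: List[Optional[ListElement]] = [None] * TABLE_SIZE
```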

Insert Procedure 

function insert (record: record_type): (replaced, inserted, full);
var position: list_element_pointer;        /* Pointer into list of found record, */
                                           /* or new element if not found. */
    dummy_pointer: list_element_pointer;   /* Don't need position's predecessor. */
    index: 0 . . . table_size - 1;         /* Table index mapped to by hash function. */
begin
  if search_table (record.key, position, dummy_pointer, index) /* Record already exist? */
  then begin                               /* Yes, update it with passed record. */
    position↑.record_contents := record;
    return (replaced)
  end
  else                                     /* No, insert new record at head of list, */
    if no memory available then return (full) /* if memory available to do so. */
    else begin                             /* Memory is available for a node. */
      new(position);                       /* Dynamically allocate new node. */
      position↑.record_contents := record; /* Hook it in. */
      position↑.next := table[index];
      table[index] := position;
      return (inserted)
    end /* else begin */
end /* insert */

Retrieve Procedure 

function retrieve (var record: record_type): (success, failure);
var position: list_element_pointer;        /* Pointer into list of found record. */
    dummy_pointer: list_element_pointer;   /* Don't need position's predecessor. */
    dummy_index: 0 . . . table_size - 1;   /* Don't need table index mapped to by hash function. */
begin
  if search_table (record.key, position, dummy_pointer, dummy_index) /* Record exist? */
  then begin                               /* Yes, return it to caller. */
    record := position↑.record_contents;
    return (success)
  end
  else return (failure)                    /* No, report failure. */
end /* retrieve */

Delete Procedure

function delete (record_key: record_key_type): (success, failure);
var position: list_element_pointer;          /* Pointer into list of found record. */
    previous_position: list_element_pointer; /* Points to position's predecessor. */
    index: 0 . . . table_size - 1;           /* Table index mapped to by hash function. */
begin
  if search_table (record_key, position, previous_position, index) /* Record exist? */
  then begin                                 /* Yes, remove it. */
    remove (position, previous_position, index);
    return (success)
  end
  else return (failure)                      /* No, report failure. */
end /* delete */
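
The three application-level functions and their return values can be sketched in Python as below. This is a hedged illustration rather than the disclosed implementation: the class name `ExpiringTable`, the `expires_at` field, the explicit `now` argument, and the `capacity` limit (standing in for "no memory available") are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Node:
    key: Any
    value: Any
    expires_at: float               # assumed expiry timestamp
    next: Optional["Node"] = None


class ExpiringTable:
    """Hypothetical wrapper; `capacity` stands in for 'no memory available'."""

    def __init__(self, table_size: int = 8, capacity: int = 1000) -> None:
        self.table: List[Optional[Node]] = [None] * table_size
        self.capacity = capacity
        self.count = 0

    def _unlink(self, node: Node, prev: Optional[Node], index: int) -> None:
        # Bypass `node` via the head pointer or the predecessor's next pointer.
        if prev is None:
            self.table[index] = node.next
        else:
            prev.next = node.next
        self.count -= 1

    def _search(self, key: Any, now: float):
        """Walk the whole chain, unlinking expired nodes along the way.

        Returns (position, previous_position, index)."""
        index = hash(key) % len(self.table)
        p, prev = self.table[index], None
        position = previous_position = None
        while p is not None:
            if p.expires_at <= now:
                self._unlink(p, prev, index)      # on-the-fly removal
            else:
                if position is None and p.key == key:
                    position, previous_position = p, prev
                prev = p
            p = p.next
        return position, previous_position, index

    def insert(self, key: Any, value: Any, expires_at: float, now: float) -> str:
        position, _, index = self._search(key, now)
        if position is not None:
            position.value, position.expires_at = value, expires_at
            return "replaced"
        if self.count >= self.capacity:
            return "full"
        self.table[index] = Node(key, value, expires_at, self.table[index])
        self.count += 1
        return "inserted"

    def retrieve(self, key: Any, now: float):
        position, _, _ = self._search(key, now)
        if position is None:
            return ("failure", None)
        return ("success", position.value)

    def delete(self, key: Any, now: float) -> str:
        position, prev, index = self._search(key, now)
        if position is None:
            return "failure"
        self._unlink(position, prev, index)
        return "success"
```

Because every call goes through `_search`, an insertion that would otherwise report `"full"` may succeed once expired records in the probed chain have been unlinked.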

Search Table Procedure 

function search_table (record_key: record_key_type;
                       var position: list_element_pointer;
                       var previous_position: list_element_pointer;
                       var index: 0 . . . table_size - 1): boolean;
/* Search table for record_key and delete expired records in target list; if found, position is made to
point to located record and previous_position to its predecessor, and TRUE is returned; otherwise
FALSE is returned. index is set to table subscript that is mapped to by hash function in either
case. */
var p: list_element_pointer;            /* Used for traversing chain. */
    previous_p: list_element_pointer;   /* Points to p's predecessor. */
begin
  index := hash (record_key);           /* hash returns value in the range 0 . . . table_size - 1. */
  p := table[index];                    /* Initialization before loop. */
  previous_p := nil;                    /* Ditto */
  position := nil;                      /* Ditto */
  previous_position := nil;             /* Ditto */
  while p ≠ nil                         /* HEART OF THE TECHNIQUE: Traverse entire list, deleting */
                                        /* expired records as we search. */
  begin
    if p↑.record_contents is expired
    then remove (p, previous_p, index)  /* ON-THE-FLY REMOVAL OF EXPIRED RECORD! */
    else begin
      if position = nil then if p↑.record_contents.key = record_key
                                        /* If this is record wanted, */
        then begin position := p; previous_position := previous_p end;
                                        /* save its position. */
      previous_p := p;                  /* Advance to */
      p := p↑.next                      /* next record. */
    end /* else begin */
  end;
  return (position ≠ nil)               /* Return TRUE if record located, otherwise FALSE. */
end /* search_table */
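
The full-traversal search can be illustrated with a short Python sketch. It is an approximation under stated assumptions, not the disclosed listing: expiry is modelled by a hypothetical `expires_at` field compared against a caller-supplied `now`, Python's built-in `hash` stands in for the unspecified hash function, and `chain_keys` is a helper added purely for inspection.

```python
from typing import Any, List, Optional


class Node:
    def __init__(self, key: Any, expires_at: float, next: Optional["Node"] = None):
        self.key = key
        self.expires_at = expires_at   # assumed expiry field
        self.next = next


def search_table(table: List[Optional[Node]], key: Any, now: float):
    """Full-traversal search: purge every expired node in the target chain.

    Returns (found, position, previous_position, index)."""
    index = hash(key) % len(table)
    p, previous_p = table[index], None
    position = previous_position = None
    while p is not None:
        if p.expires_at <= now:
            # On-the-fly removal: bypass p and leave previous_p where it is.
            if previous_p is None:
                table[index] = p.next
            else:
                previous_p.next = p.next
        else:
            if position is None and p.key == key:
                position, previous_position = p, previous_p
            previous_p = p
        p = p.next
    return position is not None, position, previous_position, index


def chain_keys(table: List[Optional[Node]], index: int) -> list:
    """Inspection helper (not in the appendix): keys remaining in one chain."""
    keys, p = [], table[index]
    while p is not None:
        keys.append(p.key)
        p = p.next
    return keys


# One bucket holding old1 -> live -> old2; both "old" records expired before now=100.
table = [Node("old1", 50, Node("live", 200, Node("old2", 60)))]
found, position, previous_position, index = search_table(table, "live", now=100)
print(found, chain_keys(table, index))  # prints: True ['live']
```

Note that the expired record located after the match is removed as well, because this version never stops before the end of the chain.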

Alternate Version of Search Table Procedure 

function search_table (record_key: record_key_type;
                       var position: list_element_pointer;
                       var previous_position: list_element_pointer;
                       var index: 0 . . . table_size - 1): boolean;
/* SAME AS VERSION SHOWN ABOVE EXCEPT THAT THE SEARCH TERMINATES IF
RECORD IS FOUND, INSTEAD OF ALWAYS TRAVERSING THE ENTIRE CHAIN. */
var p: list_element_pointer;            /* Used for traversing chain. */
    previous_p: list_element_pointer;   /* Points to p's predecessor. */
begin
  index := hash (record_key);           /* hash returns value in the range 0 . . . table_size - 1. */
  p := table[index];                    /* Initialization before loop. */
  previous_p := nil;                    /* Ditto */
  position := nil;                      /* Ditto */
  previous_position := nil;             /* Ditto */
  while p ≠ nil                         /* HEART OF THE TECHNIQUE: Traverse list, deleting */
                                        /* expired records as we search. */
  begin
    if p↑.record_contents is expired
    then remove (p, previous_p, index)  /* ON-THE-FLY REMOVAL OF EXPIRED RECORD! */
    else begin
      if p↑.record_contents.key = record_key /* If this is record wanted, */
      then begin                        /* save its position. */
        position := p;
        previous_position := previous_p;
        return (true)                   /* We found the record, so terminate search. */
      end;
      previous_p := p;                  /* Advance to */
      p := p↑.next                      /* next record. */
    end /* else begin */
  end;
  return (false)                        /* Record not found. */
end /* search_table */
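
The practical difference of the early-terminating variant is that expired records located after the matched record stay in the chain until a later probe walks past them. A minimal Python sketch of that behaviour follows, again assuming a hypothetical `expires_at` field, a caller-supplied `now`, and Python's built-in `hash` in place of the unspecified hash function.

```python
from typing import Any, List, Optional


class Node:
    def __init__(self, key: Any, expires_at: float, next: Optional["Node"] = None):
        self.key = key
        self.expires_at = expires_at   # assumed expiry field
        self.next = next


def search_table_early_exit(table: List[Optional[Node]], key: Any, now: float):
    """Search that stops at the first live match; purges only what it passes.

    Returns (found, position, previous_position, index)."""
    index = hash(key) % len(table)
    p, previous_p = table[index], None
    while p is not None:
        if p.expires_at <= now:
            # Expired: bypass p and keep previous_p unchanged.
            if previous_p is None:
                table[index] = p.next
            else:
                previous_p.next = p.next
        elif p.key == key:
            return True, p, previous_p, index   # terminate the search here
        else:
            previous_p = p
        p = p.next
    return False, None, None, index


def chain_keys(table: List[Optional[Node]], index: int) -> list:
    """Inspection helper (not in the appendix): keys remaining in one chain."""
    keys, p = [], table[index]
    while p is not None:
        keys.append(p.key)
        p = p.next
    return keys


# Chain old1 -> live -> old2, where old1 and old2 expired before now=100.
table = [Node("old1", 50, Node("live", 200, Node("old2", 60)))]
hit = search_table_early_exit(table, "live", now=100)
after_hit = chain_keys(table, 0)
print(after_hit)    # prints: ['live', 'old2']  (old2 not yet visited)

miss = search_table_early_exit(table, "missing", now=100)
after_miss = chain_keys(table, 0)
print(after_miss)   # prints: ['live']  (an unsuccessful search walks the whole chain)
```

The trade-off is a shorter successful search in exchange for expired records lingering slightly longer.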

Remove Procedure 

procedure remove (var elem_to_del: list_element_pointer;
                  previous_elem: list_element_pointer;
                  index: 0 . . . table_size - 1);
/* Delete elem_to_del↑ from list, advancing elem_to_del to next element. previous_elem points to
elem_to_del's predecessor, or nil if elem_to_del↑ is 1st element in list. */
var p: list_element_pointer;              /* Save pointer to elem_to_del for disposal. */
begin
  p := elem_to_del;                       /* Save so we can dispose when finished adjusting pointers. */
  elem_to_del := elem_to_del↑.next;
  if previous_elem = nil                  /* Deleting 1st element requires changing */
  then table[index] := elem_to_del        /* head pointer, as opposed to */
  else previous_elem↑.next := elem_to_del; /* predecessor's next pointer. */
  dispose (p)                             /* Dynamically de-allocate node. */
end /* remove */
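
The pointer adjustment performed by remove can be sketched in Python as follows. Since Python has no var parameters, this illustrative version returns the successor instead of advancing `elem_to_del` in place, and clearing the `next` reference stands in for `dispose`; both choices are assumptions of the sketch rather than features of the listing.

```python
from typing import Any, List, Optional


class Node:
    def __init__(self, key: Any, next: Optional["Node"] = None):
        self.key = key
        self.next = next


def remove(table: List[Optional[Node]], elem_to_del: Node,
           previous_elem: Optional[Node], index: int) -> Optional[Node]:
    """Unlink elem_to_del from the chain at table[index]; return its successor."""
    successor = elem_to_del.next
    if previous_elem is None:
        table[index] = successor        # deleting the 1st element: change head pointer
    else:
        previous_elem.next = successor  # otherwise: change predecessor's next pointer
    elem_to_del.next = None             # stand-in for dispose(); the GC reclaims the node
    return successor


# Chain a -> b -> c in a single bucket.
c = Node("c")
b = Node("b", c)
a = Node("a", b)
table = [a]

after_b = remove(table, b, a, 0)        # interior removal: a now links to c
after_a = remove(table, a, None, 0)     # head removal: the table entry now points to c
```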



I claim: 

1. An information storage and retrieval system, the system 
comprising: 

a linked list to store and provide access to records stored 
in a memory of the system, at least some of the records 
automatically expiring, 

a record search means utilizing a search key to access the 
linked list, 

the record search means including a means for identifying 
and removing at least some of the expired ones of the 
records from the linked list when the linked list is 
accessed, and 

means, utilizing the record search means, for accessing 
the linked list and, at the same time, removing at least 
some of the expired ones of the records in the linked 
list. 

2. The information storage and retrieval system according 
to claim 1 further including means for dynamically deter- 
mining maximum number for the record search means to 
remove in the accessed linked list of records. 

3. A method for storing and retrieving information records 
using a linked list to store and provide access to the records, 
at least some of the records automatically expiring, the 
method comprising the steps of: 

accessing the linked list of records, 

identifying at least some of the automatically expired ones
of the records, and

removing at least some of the automatically expired
records from the linked list when the linked list is
accessed.

4. The method according to claim 3 further including the 
step of dynamically determining maximum number of 
expired ones of the records to remove when the linked list 
is accessed. 

5. An information storage and retrieval system, the system 
comprising: 

a hashing means to provide access to records stored in a
memory of the system and using an external chaining
technique to store the records with same hash address,
at least some of the records automatically expiring,

a record search means utilizing a search key to access a
linked list of records having the same hash address,

the record search means including means for identifying
and removing at least some expired ones of the records
from the linked list of records when the linked list is
accessed, and

means, utilizing the record search means, for inserting,
retrieving, and deleting records from the system and, at
the same time, removing at least some expired ones of
the records in the accessed linked list of records.

6. The information storage and retrieval system according
to claim 5 further including means for dynamically deter-
mining maximum number for the record search means to
remove in the accessed linked list of records.

7. A method for storing and retrieving information records
using a hashing technique to provide access to the records
and using an external chaining technique to store the records
with same hash address, at least some of the records auto-
matically expiring, the method comprising the steps of:

accessing a linked list of records having same hash
address,

identifying at least some of the automatically expired ones
of the records,

removing at least some of the automatically expired
records from the linked list when the linked list is
accessed, and

inserting, retrieving or deleting one of the records from
the system following the step of removing.

8. The method according to claim 7 further including the 
step of dynamically determining maximum number of 
expired ones of the records to remove when the linked list 
is accessed. 

* * * * *